Main
Comparison of simple normalisation methods
Comparison of simple normalisation strategies. MA plots showing the changes in ER binding after 48 hours of treatment with 100 nM fulvestrant. Three simple normalisation methods were applied to these data and compared to the raw count data. (A) Raw counts. (B) Reads per million (RPM) reads in peaks. (C) RPM aligned reads. (D) RPM total reads. Note that the highlighted peaks remain above zero under all three standard normalisations.
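As an illustration of the RPM scaling and MA coordinates named in this caption, the following is a minimal sketch, not the pipeline used for the figure. The function names, the pseudocount, the example counts and the library sizes are all hypothetical. The same `rpm` function covers panels B to D by changing which library size (reads in peaks, aligned reads or total reads) is passed in.

```python
import math


def rpm(counts, library_size):
    """Scale raw peak counts to reads per million of the chosen library size."""
    return [c * 1_000_000 / library_size for c in counts]


def ma_values(control, treated, pseudocount=1.0):
    """Return (A, M) pairs: A is the mean log2 intensity, M the log2 fold change.

    The pseudocount is an assumption of this sketch; it avoids log2(0).
    """
    points = []
    for c, t in zip(control, treated):
        log_c = math.log2(c + pseudocount)
        log_t = math.log2(t + pseudocount)
        points.append(((log_c + log_t) / 2, log_t - log_c))
    return points


# Hypothetical raw counts for three peaks (one control, one treated sample).
control_counts = [100, 400, 1600]
treated_counts = [50, 200, 800]

# Hypothetical library sizes, e.g. the number of aligned reads per sample.
control_rpm = rpm(control_counts, library_size=20_000_000)
treated_rpm = rpm(treated_counts, library_size=10_000_000)

raw_ma = ma_values(control_counts, treated_counts, pseudocount=0.0)
rpm_ma = ma_values(control_rpm, treated_rpm, pseudocount=0.0)
```

In this toy example the raw counts halve after treatment, but so does the library size, so the RPM-scaled fold changes are zero. This is why the choice of scaling denominator alone can move the whole MA cloud up or down.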
MA Plot of H2Av normalization
MA plots showing ER binding before and after treatment with fulvestrant, including the matched Dm H2Av spike-in control. (A) Reads corrected to total aligned reads showed the same off-centre peak density as observed in Figure 1. Putative unchanged ER binding sites are within the red triangle. (B) Overlay of the MA plots for the changes in chromatin binding of Hs ER (black) and Dm H2Av (blue). Dm peaks overlay the off-centre peak density. (C) Using the Dm H2Av binding events as a ground truth for zero fold change, a linear fit to their log-fold change is generated and then applied to adjust the Hs ER binding events.
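The correction described in panel C can be sketched as follows. This is an illustrative reading of the caption, assuming an ordinary least-squares line fitted to the spike-in peaks in MA space (M against A) and subtracted from the target peaks. The function names and example values are hypothetical, and the actual pipeline may fit or apply the line differently.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def correct_log_fold_changes(spike_a, spike_m, target_a, target_m):
    """Subtract the spike-in trend from the target log-fold changes.

    The spike-in peaks are assumed to have a true log-fold change of zero,
    so any trend fitted to them is treated as technical bias.
    """
    slope, intercept = fit_line(spike_a, spike_m)
    return [m - (slope * a + intercept) for a, m in zip(target_a, target_m)]


# Hypothetical spike-in (Dm H2Av) peaks: mean log2 intensity and log2 fold change.
spike_a = [2.0, 4.0, 6.0]
spike_m = [1.0, 1.5, 2.0]

# Hypothetical target (Hs ER) peaks.
target_a = [4.0, 6.0]
target_m = [0.5, 2.0]

corrected_m = correct_log_fold_changes(spike_a, spike_m, target_a, target_m)
```

After the correction, a target peak that follows the spike-in trend exactly (the second peak here) has a log-fold change of zero, and peaks below the trend become negative.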
RARA gene locus with CTCF Spike-in
Figure 3 can be viewed interactively as a UCSC Genome Browser track.
MA Plot of CTCF Peaks
Comparison of the control regions used to normalise ER analysis before and after treatment. Dots highlighted in red are significant (FDR = 0.01). The CTCF peaks used for normalisation show no significant change in the number of reads before and after treatment.
H2Av with DiffBind
Comparison of DiffBind output before and after applying the corrected size factors that our pipeline generates from the Drosophila spike-in control. (A) Analysis of ER binding before and after treatment with fulvestrant demonstrates that DiffBind’s default normalisation strategy is more effective than the DESeq2 default, but still shows a bias between samples. (B) Applying the corrected size factors from our DESeq2 pipeline reduces the bias in the analysis (Data: SLX-8047).
Linear model
Comparison of mean counts in CTCF peaks before and after treatment. If the samples had no systematic bias before and after treatment, the linear fit would be expected to have a gradient of 1. Here, the gradient is < 1, implying a systematic bias between samples. The read counts in the peaks of the treated samples are corrected (blue), removing the bias and resulting in a new gradient of 1.
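The gradient-based correction in this caption can be illustrated with a small sketch. It assumes a least-squares line through the origin (treated mean counts against control mean counts) and a correction that divides the treated counts by the fitted gradient. Both choices, the function name and the example counts are assumptions made for illustration and are not taken from the pipeline itself.

```python
def gradient_through_origin(control, treated):
    """Least-squares gradient of treated = gradient * control (no intercept)."""
    numerator = sum(c * t for c, t in zip(control, treated))
    denominator = sum(c * c for c in control)
    return numerator / denominator


# Hypothetical mean read counts in three control (CTCF) peaks.
control_means = [10.0, 20.0, 40.0]
treated_means = [8.0, 16.0, 32.0]

# A gradient below 1 indicates systematically fewer reads in the treated samples.
gradient = gradient_through_origin(control_means, treated_means)

# Rescale the treated counts so that the refitted gradient becomes 1.
corrected_means = [t / gradient for t in treated_means]
corrected_gradient = gradient_through_origin(control_means, corrected_means)
```

With these toy values the fitted gradient is 0.8, and dividing the treated counts by it brings the refitted gradient back to 1, mirroring the blue corrected points described in the caption.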
Comparison of CTCF and H2Av normalisation methods
Comparison of normalisation methods using a consensus peak set. (A) The analyses for the CTCF-normalised (blue) and H2Av-normalised (green) datasets, using an ER consensus peak set of 10,000 peaks, were formatted as MA plots and overlaid. This recovered the low-fold-change, higher-intensity peaks that were not visible in Figure \ref{fig:ERCTCF}A, and both datasets showed a similar distribution. (B) Comparison of fold-change values for individual ER binding sites between the two datasets showed that the inclusion of these sites did not appear to affect the correlation (r = 0.77).
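For readers who want to reproduce a per-site comparison like the one in panel B, the following sketch computes a Pearson correlation between two vectors of fold changes. The caption does not state which correlation coefficient was used, so Pearson's r is an assumption here, and the fold-change values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical log2 fold changes for the same ER binding sites under the
# two normalisation methods (CTCF-based and H2Av-based).
ctcf_fold_changes = [-2.0, -1.0, 0.0, 1.0]
h2av_fold_changes = [-1.5, -1.0, -0.5, 0.0]

r = pearson_r(ctcf_fold_changes, h2av_fold_changes)
```

Because the second vector is an exact linear transform of the first, this toy example gives r = 1. Real data with site-level noise would give a lower value.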